KMID : 0381120210430091059
Genes and Genomics
2021, Volume 43, No. 9, pp. 1059-1064
Enhancing performance of gene expression value prediction with cluster-based regression
Seok Ho-Sik

Abstract
Background: The inherent correlations among gene expressions have received attention. Recently, it was reported that a set of approximately 1000 landmark genes can be utilized for prediction of expression of other genes (target genes).

Objective: The objective of this study is to predict expression values of target genes based on expression values of landmark genes.

Methods: A cluster-based regression method is proposed. In the proposed method, clusters are obtained from the training instances of a gene, and one estimator is fitted per cluster. A test instance is assigned to one of the clusters, and the regression model corresponding to that cluster predicts its expression value.
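
The procedure described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and kernel ridge regression with an RBF kernel is used because it appears among the keywords. The class name `ClusterKRR`, the parameters `k`, `alpha`, and `gamma`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Squared Euclidean distances between rows of A and rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def assign(X, centers):
    # Index of the nearest center for every row of X.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    # Assumption: k-means stands in for the unspecified clustering step.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

class ClusterKRR:
    """Hypothetical sketch: one kernel ridge regressor per cluster."""

    def __init__(self, k=2, alpha=1e-2, gamma=0.5):
        self.k, self.alpha, self.gamma = k, alpha, gamma

    def fit(self, X, y):
        centers = kmeans(X, self.k)
        used = np.unique(assign(X, centers))
        self.centers = centers[used]  # drop clusters with no training data
        labels = assign(X, self.centers)
        self.models = []
        for j in range(len(self.centers)):
            Xj, yj = X[labels == j], y[labels == j]
            K = rbf_kernel(Xj, Xj, self.gamma)
            # Closed-form kernel ridge solution: (K + alpha * I) coef = y.
            coef = np.linalg.solve(K + self.alpha * np.eye(len(Xj)), yj)
            self.models.append((Xj, coef))
        return self

    def predict(self, X):
        labels = assign(X, self.centers)  # route each test instance
        out = np.empty(len(X))
        for j in np.unique(labels):
            Xj, coef = self.models[j]
            mask = labels == j
            out[mask] = rbf_kernel(X[mask], Xj, self.gamma) @ coef
        return out

# Toy data: rows are samples, columns stand in for landmark-gene expressions.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(-3, 1, (40, 5)), rng.normal(3, 1, (40, 5))])
y_train = np.concatenate([X_train[:40].sum(1), -X_train[40:].sum(1)])
X_test = np.vstack([rng.normal(-3, 1, (5, 5)), rng.normal(3, 1, (5, 5))])

model = ClusterKRR(k=2).fit(X_train, y_train)
pred = model.predict(X_test)  # one predicted target-gene value per test row
```

In this sketch one such model would be fitted per target gene. Fitting a separate regressor per cluster lets each one capture a local relationship between landmark and target expression that a single global model might average out.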

Results: Performance of the proposed method is measured on the GEO (Gene Expression Omnibus) expression data and the GTEx (Genotype-Tissue Expression) expression data. In terms of mean absolute error averaged across target genes, the proposed method significantly outperforms previous approaches in the case of the GEO expression data.
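
The evaluation metric named in the Results, mean absolute error averaged across target genes, can be computed as in the sketch below. The layout (rows are test samples, columns are target genes), the function name `mae_across_genes`, and the small example arrays are assumptions made for illustration; they are not data from the study.

```python
import numpy as np

def mae_across_genes(y_true, y_pred):
    # Assumed layout: shape (n_samples, n_target_genes).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Mean absolute error of each target gene over the samples.
    per_gene = np.abs(y_true - y_pred).mean(axis=0)
    # Average of the per-gene errors across all target genes.
    return per_gene.mean(), per_gene

# Made-up values: 2 samples, 2 target genes.
y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.5, 2.0], [2.5, 5.0]])
overall, per_gene = mae_across_genes(y_true, y_pred)
```

Returning the per-gene errors alongside the overall average makes it possible to see whether an improvement is spread across genes or driven by a few of them.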

Conclusions: The experimental results show that the combination of clustering and regression can outperform state-of-the-art methods such as generative adversarial networks and a gradient boosting based method.
KEYWORD
Clustering, Gene expression value prediction, Kernel ridge regression, Landmark gene, Performance enhancing, Regression
Listed journal information
SCI(E), Korea Research Foundation (KCI)